OTTO: A Tool for Diplomatic Transcription of Historical Texts
Abstract
In this paper, we present OTTO, a web-based tool designed for the diplomatic transcription of historical language data. The tool supports fast and accurate typing through user-defined special characters and simultaneously provides a view of the manuscript that is as close to the original as possible. It also allows for the annotation of rich, user-defined header information. Users can log in and operate OTTO from anywhere through a standard web browser.
Similar Resources
OTTO: A Transcription and Management Tool for Historical Texts
This paper presents OTTO, a transcription tool designed for diplomatic transcription of historical language data. The tool supports easy and fast typing and instant rendering of transcription in order to gain a look as close to the original manuscript as possible. In addition, the tool provides support for the management of transcription projects which involve distributed, collaborative working...
Competition of Discourses in Journalistic Translation: Diplomatic Negotiations in Focus
We sought to understand whether, how, and why the translated journalistic texts related to the Iranian nuclear negotiations were manipulated. To this end, we monitored a news agency’s Webpage in a time span of 46 days that began 3 days before Almaty I nuclear talks and ended 3 days after Almaty II talks. Monitoring resulted in a corpus made up of 36 target texts p...
An Unsupervised Model of Orthographic Variation for Historical Document Transcription
Historical documents frequently exhibit extensive orthographic variation, including archaic spellings and obsolete shorthand. OCR tools typically seek to produce so-called diplomatic transcriptions that preserve these variants, but many end tasks require transcriptions with normalized orthography. In this paper, we present a novel joint transcription model that learns, unsupervised, a probabili...
Querying Hebrew Texts via Word Spotting
We report on recent results with word-spotting (WS) in Hebrew historical texts, manuscript and printed. The advantage of such a retrieval system is that it works on images without any need for manual or computer transcription of the texts. The method allows for extremely rapid querying, while still maintaining high accuracy; thus, it should be considered as an important tool in historical textu...
Named Entity Recognition Applied on a Data Base of Medieval Latin Charters. The Case of Chartae Burgundiae
Work on named entity recognition (NER) in databases of historical texts has been placed among the most promising new ways to implement better retrieval and management tools for exploring mass data. In this paper, we describe the application of NER, modelled with CRF, to an annotated database of the Burgundy collection of charters from the tenth to thirteenth centuries. The ai...